08. Video: Dummy Variables

Dummy Variables

The way that we add categorical variables into our multiple linear regression models is by using dummy variables. The most common way dummy variables are added is through 1, 0 encoding. In this encoding method, you create a new column for each level of a category (in this case A, B, or C). Then our new columns either hold a 1 or 0 depending on the presence of the level in the original column.

When we add these dummy variables to our multiple linear regression models, we always drop one of the columns. The column you drop is called the baseline. The coefficients you obtain from the output of your multiple linear regression models are then an indication of how the encoded levels compare to the baseline level (the dropped level).